fix: handle replacement txs in TxStore #1074

Open
technicallyty wants to merge 11 commits into main from technicallyty/recheck-snapshot-invalidation

technicallyty (Contributor) commented Mar 11, 2026

Description

Updates the TxStore to handle tx replacements. When an insert replaces a tx, all same-sender txs with a higher nonce should be invalidated, since they depended on the original tx's execution.

Closes: STACK-2479


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • tackled an existing issue or discussed with a team member
  • left instructions on how to review the changes
  • targeted the main branch

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR updates two tx stores — TxStore (EVM/legacypool) and CosmosTxStore (Cosmos mempool) — to properly handle transaction replacements by cascading invalidation to all same-sender, higher-nonce transactions that were dependent on the replaced tx.

Key changes:

  • TxStore.RemoveTx now delegates to new RemoveTxsFromNonce(addr, minNonce), removing all txs with nonce >= minNonce instead of just the single matched tx. Previously noted issues with t.total not being decremented and memory leaks from retained backing-array pointers have been fixed in this revision.
  • CosmosTxStore is rewritten to use a signer/nonce keyed index instead of pointer identity, enabling InvalidateFrom to detect replacements and evict stale dependent txs from the snapshot.
  • RecheckMempool.Insert now calls markTxInserted (new) instead of markTxRechecked. On replacement, the snapshot is cleared of stale txs; the replacement itself is intentionally withheld from the snapshot until the next recheck cycle validates it.
  • A markTxReplaced helper in legacypool explicitly documents that replacement txs are excluded from validPendingTxs until rechecker validation.
  • Tests across all three packages are updated or added to cover the new semantics, including an integration test (TestMarkTxRemovedInvalidatesPending) exercising the full replacement + reset flow.
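The cascade described for RemoveTxsFromNonce can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `tx` and `txStore` types, the string addresses, and the field layout are all assumptions made for the example. It only shows the three things the summary calls out: removing every tx with nonce >= minNonce, keeping the running total in sync, and clearing the dropped tail so the backing array does not retain pointers.

```go
package main

import (
	"fmt"
	"sort"
)

// tx is a hypothetical stand-in for a pooled transaction.
type tx struct {
	hash  string
	nonce uint64
}

// txStore is an assumed, simplified layout: per-sender txs sorted by nonce,
// a hash lookup set, and a running total.
type txStore struct {
	txs    map[string][]*tx
	lookup map[string]struct{}
	total  int
}

func newTxStore() *txStore {
	return &txStore{txs: map[string][]*tx{}, lookup: map[string]struct{}{}}
}

// addTx inserts t in nonce order for addr; a hash that is already stored is ignored.
func (s *txStore) addTx(addr string, t *tx) {
	if _, ok := s.lookup[t.hash]; ok {
		return
	}
	list := s.txs[addr]
	i := sort.Search(len(list), func(i int) bool { return list[i].nonce >= t.nonce })
	list = append(list, nil)
	copy(list[i+1:], list[i:])
	list[i] = t
	s.txs[addr] = list
	s.lookup[t.hash] = struct{}{}
	s.total++
}

// removeTxsFromNonce drops every tx from addr with nonce >= minNonce
// and returns how many were removed.
func (s *txStore) removeTxsFromNonce(addr string, minNonce uint64) int {
	list := s.txs[addr]
	i := sort.Search(len(list), func(i int) bool { return list[i].nonce >= minNonce })
	removed := list[i:]
	n := len(removed)
	for j, t := range removed {
		delete(s.lookup, t.hash)
		removed[j] = nil // drop the pointer so the backing array does not pin the tx
	}
	s.total -= n
	if i == 0 {
		delete(s.txs, addr)
	} else {
		s.txs[addr] = list[:i]
	}
	return n
}

func main() {
	s := newTxStore()
	for n := uint64(0); n < 5; n++ {
		s.addTx("alice", &tx{hash: fmt.Sprintf("a%d", n), nonce: n})
	}
	removed := s.removeTxsFromNonce("alice", 2)
	fmt.Println(removed, len(s.txs["alice"]), s.total) // prints "3 2 2"
}
```

Txs below the removed nonce are left untouched, which matches what TestTxStoreRetainsPreviousTxs is described as checking.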

Minor issue found: AddTxs in legacypool/tx_store.go now writes each hash to t.lookup twice — once in the first loop (new line) and once in the pre-existing second loop — making the second loop redundant dead code.

Confidence Score: 4/5

  • This PR is safe to merge with minor cleanup; the replacement cascade logic is correct and well-tested.
  • The core logic is sound: memory leaks and counter drift previously flagged in review have been addressed. The one remaining issue is a redundant double-write to t.lookup in AddTxs (harmless but dead code), and a potentially non-deterministic slice-order assertion in one new test. Neither blocks correctness in production.
  • Files with the remaining issues: mempool/txpool/legacypool/tx_store.go (redundant second lookup loop) and mempool/recheck_pool_test.go (strict slice-order assertion after recheck).

Important Files Changed

Filename: Overview

  • mempool/txpool/legacypool/tx_store.go: Adds RemoveTxsFromNonce which removes all txs with nonce >= minNonce, and RemoveTx now delegates to it. Memory leak fix (clear) and t.total decrement are correct. Minor: AddTxs now double-writes to t.lookup (redundant second loop).
  • mempool/tx_store.go: Rewrites CosmosTxStore to use a keyed index (signer/nonce) instead of pointer identity. Adds InvalidateFrom for replacement-aware invalidation. Duplicate-key guard logs a warning and returns (previously panicked). Logic is sound.
  • mempool/recheck_pool.go: Replaces markTxRechecked call in Insert with new markTxInserted, which calls InvalidateFrom before optionally adding to snapshot. Replacement txs are intentionally excluded from snapshot until next recheck cycle.
  • mempool/recheck_pool_test.go: Adds two new tests: one verifying rechecked-snapshot invalidation on replacement, another verifying full recheck rebuild. Strict slice-order assertion on recheck result may be non-deterministic.
  • mempool/txpool/legacypool/tx_store_test.go: Tests updated to match new cascade-removal semantics. TestTxStoreRemoveTx correctly expects 0 remaining txs after removing nonce 0. New TestTxStoreRetainsPreviousTxs verifies txs before the removed nonce are retained.
  • mempool/txpool/legacypool/legacypool_test.go: Adds TestMarkTxRemovedInvalidatesPending integration test covering the full replacement flow: replace nonce 4, assert 5+6 evicted, reset, verify all 3 return after recheck.
  • mempool/tx_store_test.go: Adds keyedMockTx type for testing key-based deduplication. Updates TestCosmosTxStoreDedup to use keyed tx so the new keys map participates in the test.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant RecheckMempool
    participant CosmosTxStore
    participant LegacyPool
    participant TxStore

    Note over Caller,TxStore: Replacement tx insert flow

    Caller->>RecheckMempool: Insert(replacement tx @ nonce N)
    RecheckMempool->>RecheckMempool: markTxInserted(tx)
    RecheckMempool->>CosmosTxStore: InvalidateFrom(tx)
    alt tx key exists in snapshot (is a replacement)
        CosmosTxStore->>CosmosTxStore: remove all stored txs with nonce >= N
        CosmosTxStore-->>RecheckMempool: removed > 0
        Note over RecheckMempool: replacement NOT added to snapshot yet
    else tx key not in snapshot (fresh insert)
        CosmosTxStore-->>RecheckMempool: 0
        RecheckMempool->>CosmosTxStore: AddTx(tx)
    end

    Caller->>LegacyPool: add(replacement tx)
    LegacyPool->>LegacyPool: markTxRemoved(addr, old tx, Pending)
    LegacyPool->>TxStore: RemoveTx(addr, old tx)
    TxStore->>TxStore: RemoveTxsFromNonce(addr, nonce N)
    Note over TxStore: removes old tx AND all txs with nonce >= N
    LegacyPool->>LegacyPool: markTxReplaced(addr, replacement)
    Note over LegacyPool: replacement NOT added to validPendingTxs

    Note over Caller,TxStore: Next recheck cycle
    RecheckMempool->>RecheckMempool: TriggerRecheck
    RecheckMempool->>CosmosTxStore: rebuild snapshot via markTxRechecked
    Note over CosmosTxStore: replacement + dependent txs re-added after validation
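The Cosmos-side branch of the diagram (invalidate on replacement, otherwise add) can be sketched as below. This is an illustration under simplifying assumptions: every tx has a single signer, and the `tx`, `snapshot`, `key`, `invalidateFrom`, and `markTxInserted` names are stand-ins written for this example rather than the PR's actual types or signatures.

```go
package main

import "fmt"

// tx is a hypothetical single-signer transaction.
type tx struct {
	signer string
	nonce  uint64
	id     string // distinguishes a replacement from the original
}

// key builds the signer/nonce key that identifies a slot in the snapshot.
func key(t tx) string { return fmt.Sprintf("%s/%d", t.signer, t.nonce) }

// snapshot is an assumed stand-in for the rechecked-tx snapshot.
type snapshot struct {
	txs  []tx
	keys map[string]struct{}
}

func newSnapshot() *snapshot { return &snapshot{keys: map[string]struct{}{}} }

// addTx appends t unless its signer/nonce slot is already taken.
func (s *snapshot) addTx(t tx) {
	k := key(t)
	if _, dup := s.keys[k]; dup {
		return
	}
	s.keys[k] = struct{}{}
	s.txs = append(s.txs, t)
}

// invalidateFrom treats t as a replacement if its key is already present,
// and in that case evicts every tx from the same signer with nonce >= t.nonce.
// It returns the number of evicted txs (0 for a fresh insert).
func (s *snapshot) invalidateFrom(t tx) int {
	if _, ok := s.keys[key(t)]; !ok {
		return 0
	}
	next := make([]tx, 0, len(s.txs))
	removed := 0
	for _, existing := range s.txs {
		if existing.signer == t.signer && existing.nonce >= t.nonce {
			delete(s.keys, key(existing))
			removed++
			continue
		}
		next = append(next, existing)
	}
	s.txs = next
	return removed
}

// markTxInserted adds fresh txs to the snapshot but withholds replacements,
// leaving them to be re-added by the next recheck.
func (s *snapshot) markTxInserted(t tx) {
	if s.invalidateFrom(t) > 0 {
		return
	}
	s.addTx(t)
}

func main() {
	s := newSnapshot()
	for n := uint64(4); n <= 6; n++ {
		s.markTxInserted(tx{signer: "alice", nonce: n, id: "orig"})
	}
	s.markTxInserted(tx{signer: "bob", nonce: 0, id: "orig"})
	s.markTxInserted(tx{signer: "alice", nonce: 4, id: "replacement"})
	fmt.Println(len(s.txs), s.txs[0].signer) // prints "1 bob"
}
```

After the replacement at nonce 4, alice's txs at nonces 4, 5, and 6 are gone from the snapshot and only bob's tx remains, mirroring the "replace nonce 4, assert 5+6 evicted" flow described for the integration test.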

Comments Outside Diff (1)

  1. mempool/txpool/legacypool/tx_store.go, lines 113-128

    Redundant double-write to t.lookup in AddTxs

    This PR adds t.lookup[tx.Hash()] = struct{}{} at line 116 (inside the first loop), which is correct and enables within-batch deduplication. However, the pre-existing second loop at lines 125–128 (for _, tx := range toAdd { t.lookup[tx.Hash()] = struct{}{} }) now overwrites the exact same entries that were just written in the first loop. The second loop is redundant and should be removed.

    While this is harmless (idempotent map writes), it's dead code that adds cognitive overhead for future maintainers.
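For illustration, a single-pass version of the pattern the comment describes might look like the sketch below. The `tx` and `txStore` types and the `addTxs` signature are assumptions for this example, not the actual legacypool code. The point is only that writing to the lookup set at acceptance time already covers both stored duplicates and within-batch duplicates, so no second pass is needed.

```go
package main

import "fmt"

// tx is a hypothetical transaction identified by its hash.
type tx struct{ hash string }

// txStore is an assumed minimal store with a hash lookup set.
type txStore struct {
	lookup map[string]struct{}
	txs    []tx
}

// addTxs records each hash in lookup as soon as the tx is accepted, so
// duplicates already in the store and duplicates inside the batch are both
// skipped in one loop. It returns the number of txs actually added.
func (s *txStore) addTxs(batch []tx) int {
	added := 0
	for _, t := range batch {
		if _, seen := s.lookup[t.hash]; seen {
			continue
		}
		s.lookup[t.hash] = struct{}{}
		s.txs = append(s.txs, t)
		added++
	}
	return added
}

func main() {
	s := &txStore{lookup: map[string]struct{}{}}
	fmt.Println(s.addTxs([]tx{{"a"}, {"b"}, {"a"}})) // prints 2
	fmt.Println(s.addTxs([]tx{{"b"}, {"c"}}))        // prints 1
}
```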

Last reviewed commit: facec32

technicallyty marked this pull request as draft March 12, 2026 00:04
require.Equal(t, uint64(1), result[addr1][0].Tx.Nonce())
}

func TestTxStoreConcurrentRemove(t *testing.T) {
Contributor Author

not sure if this test makes sense to have anymore


result := store.Txs(txpool.PendingFilter{})
- require.Len(t, result[addr1], 500)
+ require.Len(t, result[addr1], 0)
Contributor Author

Updating this because I think the behavior we want now is: if a tx is removed, all txs with a greater nonce should be considered invalid and removed as well, since they cannot be included in a block proposal.

technicallyty marked this pull request as ready for review March 12, 2026 04:06
@technicallyty (Contributor Author)

@greptile re-review

codecov bot commented Mar 12, 2026

Codecov Report

❌ Patch coverage is 83.33333% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.51%. Comparing base (43a1f17) to head (facec32).

Files with missing lines                  Patch %   Lines
mempool/tx_store.go                       83.33%    4 Missing and 8 partials ⚠️
mempool/recheck_pool.go                   71.42%    1 Missing and 1 partial ⚠️
mempool/txpool/legacypool/tx_store.go     88.23%    0 Missing and 2 partials ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1074      +/-   ##
==========================================
+ Coverage   65.45%   65.51%   +0.06%     
==========================================
  Files         331      331              
  Lines       23262    23347      +85     
==========================================
+ Hits        15225    15296      +71     
+ Misses       6894     6887       -7     
- Partials     1143     1164      +21     
Files with missing lines                  Coverage Δ
mempool/recheck_pool.go                   82.10% <71.42%> (-2.24%) ⬇️
mempool/txpool/legacypool/tx_store.go     92.85% <88.23%> (-7.15%) ⬇️
mempool/tx_store.go                       86.00% <83.33%> (-14.00%) ⬇️

... and 4 files with indirect coverage changes


aljo242 (Contributor) commented Mar 12, 2026

@greptile re-review

@technicallyty (Contributor Author)

@greptile re-review

mattac21 (Contributor) left a comment

some questions but looks good to me

// cosmosTxNonceMap extracts the signers from the transaction
// and returns a signer -> nonce map.
func cosmosTxNonceMap(tx sdk.Tx) (map[string]uint64, bool) {
	signerSeqs, err := extractSignerSequences(tx)
Contributor

Is there a limit on the number of signers a tx can have that is validated somewhere before this? If not, this is a user-supplied value we are allocating memory on, which isn't good (e.g. someone sends us a tx with 5 million tiny invalid signers). (I can't comment on the line since it isn't changed, but see extractSignerSequences, line 487.)

Contributor

mm i guess we are running ante handlers just before this, so that would probably be validated away in there
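If a defensive check were wanted here in addition to whatever the ante handlers enforce, it could be as small as the sketch below. Everything in it is hypothetical: the `maxSigners` constant, its value, the `signerSeq` type, and the `nonceMap` function are made up for illustration and are not part of the PR. The idea is simply to reject an oversized signer list before allocating anything proportional to it.

```go
package main

import (
	"errors"
	"fmt"
)

// maxSigners is a hypothetical cap chosen for this example only.
const maxSigners = 8

// signerSeq is an assumed (account, sequence) pair extracted from a tx.
type signerSeq struct {
	account string
	seq     uint64
}

var errTooManySigners = errors.New("too many signers")

// nonceMap builds a signer -> nonce map, refusing inputs above the cap
// so the map allocation is bounded regardless of what the sender supplied.
func nonceMap(signers []signerSeq) (map[string]uint64, error) {
	if len(signers) > maxSigners {
		return nil, errTooManySigners
	}
	m := make(map[string]uint64, len(signers))
	for _, s := range signers {
		m[s.account] = s.seq
	}
	return m, nil
}

func main() {
	m, err := nonceMap([]signerSeq{{"alice", 4}})
	fmt.Println(m, err)

	_, err = nonceMap(make([]signerSeq, maxSigners+1))
	fmt.Println(err)
}
```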

if i > 0 {
	b.WriteByte('|')
}
fmt.Fprintf(&b, "%s/%d", sig.account, sig.seq)
Contributor

I guess same question on the max number of signers allowed here: could someone force us to allocate a massive string? If so, maybe we have to hash each signer's account + seq together with the other signers' keys to get a fixed-size key, but that seems way slower.
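A rough sketch of the fixed-size key idea, for comparison with the string key: each (account, seq) pair is length-prefixed and fed into SHA-256, giving a 32-byte key no matter how many signers there are. The `signerSeq` type and `fixedKey` function are assumptions for this example, and the choice of SHA-256 is only one option; whether the hashing cost is acceptable would need measuring.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// signerSeq is an assumed (account, sequence) pair extracted from a tx.
type signerSeq struct {
	account string
	seq     uint64
}

// fixedKey hashes the signers in order into a 32-byte key. The account is
// length-prefixed so that adjacent fields cannot be confused with each other.
func fixedKey(signers []signerSeq) [sha256.Size]byte {
	h := sha256.New()
	var buf [8]byte
	for _, s := range signers {
		binary.BigEndian.PutUint64(buf[:], uint64(len(s.account)))
		h.Write(buf[:])
		h.Write([]byte(s.account))
		binary.BigEndian.PutUint64(buf[:], s.seq)
		h.Write(buf[:])
	}
	var out [sha256.Size]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	a := fixedKey([]signerSeq{{"cosmos1abc", 7}, {"cosmos1def", 2}})
	b := fixedKey([]signerSeq{{"cosmos1abc", 7}, {"cosmos1def", 2}})
	c := fixedKey([]signerSeq{{"cosmos1abc", 8}, {"cosmos1def", 2}})
	fmt.Println(len(a), a == b, a == c) // prints "32 true false"
}
```

Because the result is an array rather than a slice, it is comparable and can be used directly as a map key.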

removed := 0
nextTxs := make([]sdk.Tx, 0, len(s.txs))
for _, existing := range s.txs {
	if invalidatesCosmosTx(existing, thresholds) {
Contributor

We compute the nonce map in invalidatesCosmosTx, but we have already computed it in InvalidateFrom; could we pass it in?

Contributor Author

are you referring to the nonceMap on line 68? that one is the nonceMap for the tx we are checking, the one here is for each tx we check against


// rebuild the txs list, skipping txs that are invalidated.
removed := 0
nextTxs := make([]sdk.Tx, 0, len(s.txs))
Contributor

damn this feels super rough that we have to rebuild the entire list again. do you think it would be better if we changed the data structure of the tx store here to be more like the evm one which is a map? so we only have to rebuild per account

s.txs = txs
s.keys = make(map[string]int, len(txs))
for i, tx := range txs {
	if key, ok := cosmosTxKey(tx); ok {
Contributor

computing this key again seems really rough, would love if we could maybe cache it on the tx itself, kind of like how signatures are cached on eth transaction structs, but I agree probably fine to just leave this for now
